Parallel extreme learning machine for regression based on MapReduce

Authors

  • Qing He
  • Tianfeng Shang
  • Fuzhen Zhuang
  • Zhongzhi Shi
Abstract

Regression is one of the most basic problems in data mining. For regression problems, the extreme learning machine (ELM) can achieve better generalization performance at a much faster learning speed. However, the growing volume of datasets makes regression by ELM on very large-scale datasets a challenging task. By analyzing the mechanism of the ELM algorithm, an efficient parallel ELM for regression is designed and implemented on the MapReduce framework, currently a simple but powerful parallel programming technique. The experimental results demonstrate that the proposed parallel ELM for regression can efficiently handle very large datasets on commodity hardware, with good performance on several evaluation criteria, including speedup, scaleup and sizeup. © 2012 Elsevier B.V. All rights reserved.
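
The abstract does not spell out how the computation is split. One plausible decomposition, offered here only as a sketch and not necessarily the authors' design, relies on the fact that the ELM output weights solve (H^T H) beta = H^T T, and that both H^T H and H^T T are sums over samples. Each map task can therefore accumulate partial sums for its own data split, and a reduce step can add them. In the Python sketch below, every name (`hidden_row`, `map_partition`, `reduce_partials`, `solve`, `train_elm`), the sigmoid activation and the small `ridge` term are assumptions made for illustration, and the map calls run sequentially rather than on a MapReduce cluster.

```python
import math
from functools import reduce


def hidden_row(x, W, b):
    """Sigmoid hidden-layer output for one sample x (assumed activation)."""
    return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + bj)))
            for w, bj in zip(W, b)]


def map_partition(rows, W, b):
    """Map step: partial sums U = H^T H and V = H^T t for one data split."""
    L = len(b)
    U = [[0.0] * L for _ in range(L)]
    V = [0.0] * L
    for x, t in rows:
        h = hidden_row(x, W, b)
        for i in range(L):
            V[i] += h[i] * t
            for j in range(L):
                U[i][j] += h[i] * h[j]
    return U, V


def reduce_partials(a, c):
    """Reduce step: element-wise sum of two (U, V) partial results."""
    (Ua, Va), (Uc, Vc) = a, c
    U = [[p + q for p, q in zip(ra, rc)] for ra, rc in zip(Ua, Uc)]
    V = [p + q for p, q in zip(Va, Vc)]
    return U, V


def solve(A, y):
    """Solve A beta = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * beta[k] for k in range(i + 1, n))
        beta[i] = (M[i][n] - s) / M[i][i]
    return beta


def train_elm(partitions, W, b, ridge=1e-3):
    """Combine per-split partial sums, then solve for the output weights.

    The ridge term is an assumption added to keep the system well conditioned.
    """
    partials = [map_partition(p, W, b) for p in partitions]  # parallel on a cluster
    U, V = reduce(reduce_partials, partials)
    for i in range(len(V)):
        U[i][i] += ridge
    return solve(U, V)
```

Because the partial sums are simply added, the output weights should agree, up to floating-point rounding, however the training rows are divided among splits. Only the small L x L system (L being the number of hidden nodes) is solved on a single machine.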

Similar resources

Efficient Batch Parallel Online Sequential Extreme Learning Machine Algorithm Based on MapReduce

With the development of technology and the widespread use of machine learning, more and more models need to be trained to mine useful knowledge from large-scale data. It has become a challenging problem to train multiple models accurately and efficiently so as to make full use of limited computing resources. As one of the ELM variants, the online sequential extreme learning machine (OS-ELM) provides a ...

Dynamic Cost-sensitive Ensemble Classification based on Extreme Learning Machine for Mining Imbalanced Massive Data Streams

To lower the classification cost and improve classifier performance, this paper proposes a dynamic cost-sensitive ensemble classification approach based on the extreme learning machine for imbalanced massive data streams (DCECIMDS). First, the paper presents a method for detecting concept drift by extracting the attributive characters of imbalanced massive data stream...

A parallel approximate SS-ELM algorithm based on MapReduce for large-scale datasets

The Extreme Learning Machine (ELM) algorithm has not only attracted much attention from scholars and researchers but has also been widely applied in recent years, especially when dealing with big data, because of its better generalization performance and learning speed. The proposal of SS-ELM (semi-supervised Extreme Learning Machine) extends the ELM algorithm to the area of semi-supervised learning whi...

PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce

Classification and regression tree learning on massive datasets is a common data mining task at Google, yet many state of the art tree learning algorithms require training data to reside in memory on a single machine. While more scalable implementations of tree learning have been proposed, they typically require specialized parallel computing architectures. In contrast, the majority of Google’s...

A MapReduce based Parallel SVM for Email Classification

The Support Vector Machine (SVM) is a powerful classification and regression tool. Various approaches, including SVM-based techniques, have been proposed for email classification. Automated email classification according to messages or user-specific folders and information extraction from chronologically ordered email streams have become interesting areas in text machine learning research. This paper prese...

Journal title:
  • Neurocomputing

Volume 102

Publication date: 2013